docs: add key terms to use case intros/tutorial and what is dvc? docs [SEO] #1806
Reviewed What is DVC? for now ☝️ BTW, you should have access to push branches to this repo @jeremydesroches so no need to use a fork going forward 🙂. Using branches directly on
Awesome. I will do that! Thanks @jorgeorpinel |
Review of use cases index. Checking PR scope. Please generalize this feedback to the other docs before I get to review them.
that DVC can help with or improve. Our use cases are not written to be run
end-to-end like tutorials. For more general, hands-on experience with DVC,
please see our [Get Started](/doc/tutorials/get-started) instead.
We provide short articles on common ML workflow and data science use cases that
Yes there is an SEO motivation here: the search term is "data science use cases".
I see! Going fwd if you can make some notes in the PR file changes on terms each change is for, or a list of terms in the PR description at least, that would be helpful for reviews 😃
👍 Definitely. That makes a lot of sense and I'll do that.
Does it matter that probably users looking for "data science use cases" are not looking for DVC use cases? I don't want to assume what 1000s of people want, but it sounds like a basic data science question rather than anything to do with structuring DS projects (e.g. using DVC).
So maybe changes like this will bring more traffic but also up the bounce rate. We'll have to try and see, I guess!
> Does it matter that probably users looking for "data science use cases" are not looking for DVC use cases?
It's true that the term is not a perfect match, but it is related to the primary subject area (data science). Most non-brand terms are going to be partially related but inexact, as searches for discovery are imprecise (because they don't know what DVC is yet).
The search engine is trying to fill in the gaps, so we want to expand on terms that are showing interest within the correct subject area in order to meet them halfway. This article already has some impressions for "use cases", including ML and data science so that's the motivation for this change.
OK, cool! Keeping unresolved for future reference.
…s/dvc.org into seo-wk2-use-cases
Finished reviewing all the changes:
content/docs/use-cases/versioning-data-and-model-files/index.md
content/docs/use-cases/versioning-data-and-model-files/tutorial.md
That's it! We've tracked the second version of the dataset, model, and metrics
in DVC and committed the DVC-files that point to them with Git. Now let's look
- a -> the: Actually 'a' is slightly more correct here.
- Thanks for the "them committed" fix 👍
- Let's now -> Now let's: What's the difference?
Thanks! "a second version" is better. Made the change.
Reverted to "Let's now" — I tried some other versions of the second sentence when I rewrote the first one, and unintentionally switched them. (IMO "Let's now" is a better usage for following a result with a new instruction. 😉 )
@shcheklein would you like to review this first SEO PR too? |
@jeremydesroches @jorgeorpinel a lot of great improvements, guys! thanks! |
Hey @jorgeorpinel, I tried this earlier and got a permission error. Can you check if I have access? Thanks |
Sure, we'll check. |
> Add "data and model versioning", "versioning (large) data files", and "model versions"
These four pages were submitted for reindexing and added to a list of docs to track for ongoing SEO performance. |
OK, checking back on this, let's see if we can find anything on the Search Console now (over 10 days after merge). Starting with https://dvc.org/doc/user-guide/what-is-dvc,
@jeremydesroches could you please look at the other 3 pages and report whether there is any impact so far/ share links to check back later on? Any other comments/tips on this evaluation process are appreciated. Thanks! |
Hi @jorgeorpinel. No, a redirect won't do anything at this point because the old URLs aren't in the index anymore. You can check this for any given URL by clicking the magnifying glass (or searching for it in the bar at top on SC).
My assessment for each of the four pages is included below. The process I use is as follows:
Sorry, no way to smooth things out to weekly in SC. For a comparison period this short it helps (me, anyway) to see the peaks/valleys. The curves are nice but in the end the aggregate and term-level gain/loss are the key metrics for any given period.
- What is DVC? – Merge - Nov 2 vs. previous period
- Use Cases Index – Merge - Nov 2 vs. previous period
- Versioning Data and Model Files Index – Merge - Nov 2 vs. previous period
- Versioning Data and Model Files Tutorial – Merge - Nov 2 vs. previous period
OK, I'll take a look at your specific reports ASAP. For now I just wanted to follow up on my own mini report, as there is now more of an impact, both in what-is-dvc, which seems to have a consistently improved avg position: But especially in the "machine learning" term, which is starting to come alive: So good work with those, even if the change is small so far. At least we know there is a measurable impact!
Hi. Doing a final check on this, the first PR. Here are the results I can see: My 2 previous mini-reports continued the same: improvements in the secondary? metrics (impressions and position). For the other docs I just checked the weeks from the time they were merged vs. the same number of weeks (6) before that:
- What is DVC? - link - more impressions and better position, yet no more clicks. Maybe people just aren't that interested in this topic.
- Use Cases Index - link - This one actually has ~10% more impressions AND ~15% more clicks, so the CTR improved 👍. It suggests to me that this is one of the sections with the most potential for growth.
- Versioning Data and Model Files use case - link - similar to the previous one (even better clicks and CTR improvement)
- Versioning Tutorial - link - also good improvements (around 25% in clicks, for example)

So in general these are good results. cc @shcheklein @jeremydesroches please start posting similar reports in your other PRs next week. Thanks
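As a rough sanity check on the Use Cases Index figures in this report (~10% more impressions, ~15% more clicks), the implied relative CTR change can be sketched as below. This is only an illustration: the function name `relative_ctr_change` is hypothetical, and the inputs are the approximate percentages quoted in the comment, not exact Search Console data.

```python
def relative_ctr_change(impressions_growth: float, clicks_growth: float) -> float:
    """Relative CTR change given relative growth in impressions and clicks.

    CTR = clicks / impressions, so the new CTR equals the old CTR scaled by
    (1 + clicks_growth) / (1 + impressions_growth).
    """
    return (1 + clicks_growth) / (1 + impressions_growth) - 1


# Approximate figures quoted for the Use Cases Index page (assumed, rounded).
change = relative_ctr_change(impressions_growth=0.10, clicks_growth=0.15)
print(f"Relative CTR change: {change:.1%}")
```

With these rounded inputs the CTR gain is smaller than the gain in clicks alone, because part of the extra clicks is explained by the extra impressions.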
Thanks for checking on this @jorgeorpinel. Yes, I’ll do the same for other PRs this week along with the other GSoD reports. |
* cases: [WIP] befin rewriting Versioning: explain why versioning large files is important/a thing per #1716 (comment)
* cases: give some sense of why versioning data and models is important per #1747 (comment)
* guide: why DVC is the way to Version data (sell philosophy) per #1747 (comment)
* cases: add example section explaining why data versionig is is important, and how it looks with DVC per #1747 (comment)
* cases: wrap up Versioning full draft
* cases: rename demo section in Versioning, roll back checkout img, et al.
* cases: some more versioning updates
* cases: shorten versioning intro
* cases: add bullet list of Versioning advantages per #1747 (comment)
* cases: shorten Why DVC section in Versioning
* term: data modeling -> data engineering per #1747 (review)
* cases: make advantages section in Data Registry (consistency)
* cases: make separate Versioned storage section
* cases: rewrite intro and other changes to Versioning per #1747 (comment)
* cases: cover gap between Versioning and (remote) storage, link to GS also per #1747 (comment)
* use-cases: reapply SEO keyword changes from #1806 > Add "data and model versioning", "versioning (large) data files", and "model versions"
* cases: make p about storage less overlapping to previous one per #1747 (review)
* cases: add paragraph about versioning advantages before DVC's motivation per #1747 (review)
* cases: simplify lists of advantages in Versioning (and Data Reg) rel #1747 (review)
* cases: limitation->constraint (to avoid a redundancy)
* guide: move DVC is not Git! from use cases to What is DVC? rel #1747 (review)
* cases: ~~Summary of~~ Advantages (H2)
* cases: rewrite parts of the DVC motivation paragraphs in Versioning
* cases: improve vrsng intro and dedupe bullet lists
* cases: rename Advantages sectino of vrsng per #1747 (review)
* cases: expand on How it looks (vrsng) with focus on workspace per #1747 (review)
* guide: improve DVC is not Git! section per #1747 (review)
* cases: rename Versioning use case (why "Files"?) per #1747 (review)
* cases: rewrite (again) the intro to vrsng per #1747 (comment)
* cases: improve versioning intro (more coherent)
* cmd: quick term update
* cases: update links to Versioning use case
* cases: refine Versioning intro, add proposed figure
* cases: summarize, simplify, focus on the essence, et al. and propose new "Versioned storage" use case
* cases: add redirect for new Versioning use case location
* cases: merge How it looks + Version control sections
* cases: simplify versioning-data-and-models#how-it-looks
* Revert "redirect for new Versioning use case URL" 12bc7ed and put back the files and nav
* cases: rewrite intro to improve motivation and post a draft figure proposal
* cases: update Why DVC and benefits list based on https://docs.google.com/document/d/1jmvbsRC2JhzqAF0eTGu0tX9ydMNndiBviCHq5ezzfEY/edit
* cases: actually revert URL change from recent commit
* cases: more updates to the benefits bullets in Versioning
* cases: rewrite How it looks (& feels) section
* cases: remove non-essential info. from How it looks section of Versioning (a little aggressive)
* cases: simplify How it looks per David and some of Ivan's feedback (remove cache mentions)
* cases: remove H2s temporarily, simplify benefits bullet list, et al.
* cses: rewrite benefit bullets and simplify how it feels section
* cases: make bullet list into paragraph temp.
* cases: wrap up Vrsng? (text)
* cases: hardcode colums in How it feels section of Vrsng
* cache: simplify it's structure explanation and add CAS term (from Vrsng use case)
* guide: revert changes to this section for now
* cases: polish latest iteration of Versioning use case
* cases: next iteration of Versioning page per private feedback. Some issues may still be outstanding, will send smalles commits next
* cases: polishing my last iteration of the Vsng page
* remove a bunch of info from Vrsng to simplify again
* cases: minor iteration of Vrsng, pending benefits list
* guide: updates to What is DVC per #1747 (review) and #1747 (review)
* cmd: roll-back unrelated changes (stashed elsewhere for now) per #1747 (review) and #1747 (review)
* cases: work on benefits of Vrsng
* cases: more work on benefits of Vrsng
* cases: remove emojis; improve benefits list; add refs to other cases
* cses: clarify about cache and about metafiles in Versioning
* cases: simplify p about roll back/fwds; split benefit about data regs
* cases: change BEFORE to be similar to the top fig.
* cases: another iteration of Versioning
* cases: simplify Versioning again
* cases: improvements on Vrsng per direct feedback
* cases: more updates to latest text and figures
* cases: rephrase Vrsng benefits list
* cases: revert to previous draft fig
* cases: update 2nd figure draft, and reorder codification p
* cases: rework Vrsng benefits and other small improvements and removed advanced topics (for a new section coming up)
* cases: draft What's Next section added with advanced scenarios for Vrsng
* cases: simplify 2nd figure
* cases: make first Vrsg figure shorter
* cases: merge advanced scenarios with benefits list
* cases: roll back changes to Data Regs per https://github.com/iterative/dvc.org/pull/1747/files#r515533725
* cases: improvements per Dmitry's feedback... see #1747 (review)
* cases: train_feats > features in figures for Vrsng
* cases: rename Vrng Tutorial label in nav (use emoji)
* cases: explain simple file naming a bit more per #1747 (comment)
* cases: Vrng copy edits
* cases: add efficient data mgmt benefit per #1747 (comment)
* cases: reorder Vrsg benefits list per #1747 (comment)
* cases: rewrite file naming and data mgmt benefits of Vrsg
* cases: expand story to cover storage and data management and update benefits
* cases: generalized Vrsg benefits
* cases: separate data mgmt from versioning (through codification) in Vrsg
* Make note about other guides, refs, and tutorial (Vrng)
* cases: emphasize Simplicity benefit of Vrng is the opposite of "complicated"
* cases: another rewrite of text and benefits
* cases: copy edits to latest Vrng iteration, and append next steps paragaph (bottom)
* cases: another iteration of Versioning use case
* cases: clarify data mgmt is for data in Vrng benefits
OK @jorgeorpinel, I've updated the other GSoD PRs with images now too. Please check my notes above as I added some images and updates to your original findings for this PR. |
Based on existing search results, I expanded the use case intro docs with data and model versioning references. The tutorial received similar changes — pending review with @jorgeorpinel.
Merged PR #1805 (meant to separate commits but I messed it up 😒).
/docs/user-guide/what-is-dvc.md
Add explicit "machine learning" references including "version machine learning experiments"
/docs/use-cases/index.md
Add "data science use cases", "tools" and "best practices"
/docs/use-cases/versioning-data-and-model-files/index.md
Add "data and model versioning", "versioning (large) data files", and "model versions"
/docs/use-cases/versioning-data-and-model-files/tutorial.md
Add "ML model versions", "dataset versioning", "model and large dataset versioning", "machine learning models", "dataset and ML model versioning"
UPDATE: See results in #1806 (comment)